1.  Introduction


This paper deals with estimating regression coefficients in the
usual linear model. Let y be a T-component random vector with
expected value

(1)  $\mathcal{E} y = Z\beta$ ,

where Z is a T x p matrix of numbers and $\beta$ is a p-component
vector of parameters. (All vectors are column vectors.) For
convenience we assume that the rank of Z is the number of columns, p.
The covariance matrix of y is

(2)  $\mathcal{C}(y) = \mathcal{E}(y - Z\beta)(y - Z\beta)' = \Sigma$ .

(Transposition of a vector or matrix is denoted by a prime.) Again
for convenience we shall assume that $\Sigma$ is positive definite. The
problem is to estimate $\beta$ on the basis of one observation on y
when Z is known.

When $\Sigma$ is known or is known to within a constant multiple, the
Markov or Best Linear Unbiased Estimate (BLUE) is given by

(3)  $b = (Z'\Sigma^{-1}Z)^{-1} Z'\Sigma^{-1}y$ .

The least squares estimate is given by

(4)  $b^* = (Z'Z)^{-1} Z'y$ .
The covariance matrix of the Markov estimate is

(5)  $\mathcal{C}(b) = (Z'\Sigma^{-1}Z)^{-1}$ ;

the covariance matrix of the least squares estimate is

(6)  $\mathcal{C}(b^*) = (Z'Z)^{-1}Z'\Sigma Z(Z'Z)^{-1}$ .

Both  of  the  estimates  are  linear  and  unbiased. 

The optimality property of the Markov estimate implies that
$\mathcal{C}(b^*) - \mathcal{C}(b)$ is positive semidefinite; that is, any linear function
of the Markov estimate has a variance no larger than the variance of
that linear function of the least squares estimate. Since the least
squares estimate can always be calculated, but the Markov estimate is
unavailable if the covariance matrix $\Sigma$ is not known to within a
constant of proportionality, an interesting problem is to find under
what conditions the least squares estimate is identical to the Markov
estimate. It will be noted that they are identical when $\Sigma$ is a
multiple of the identity, I. The general answer is given by the
following theorem:

Theorem 1. The least squares estimate (4) is identical to the best
linear unbiased estimate (3) if and only if $Z = V^*C$, where the p
columns of $V^*$ are p linearly independent characteristic vectors of
$\Sigma$ and C is a nonsingular matrix.

The sufficiency of the condition was essentially given by myself
in 1948 in the Skandinavisk Aktuarietidskrift [1]. In that paper I
showed that if y is normally distributed, then the least squares
estimate is identical to the maximum likelihood estimate; under normality,
of course, the maximum likelihood estimate is best linear unbiased.

Watson [9], [10] and G. S. Watson and Hannan [12] studied the efficiency
of least squares estimates; the inequality given in the first two papers
shows the necessity of the condition for p = 1. Magness and McGuire [6]
rediscovered the condition, proving sufficiency and necessity. Watson [11]
and Zyskind [13] have made more intensive studies and surveyed the
literature.

A problem that is more explicitly and specially a time series problem
occurs in the case where the residuals constitute a stationary stochastic
process. The property $\sigma_{st} = \sigma(s - t)$, where $\Sigma = (\sigma_{st})$, denotes
stationarity in the wide sense. In general, the least squares estimate and
the best linear unbiased estimate will be different. The characteristic
vectors of $\Sigma$ depend on the values of the serial or lag covariances
and hence the best linear unbiased estimate depends on these parameters,
which are generally unknown.

In this case we consider the covariance matrices of the estimates,
normalize them suitably and identically, and consider the limits of them
as $T \to \infty$. Grenander [4], Rosenblatt [7] in the Third Berkeley Symposium,
and these two authors [5] found conditions for which the two limiting
covariance matrices are the same. They did not indicate that their
results were asymptotic analogues of the result for a finite sample,
and the statement of their results and their methods of proof do not
make it easy to see the relationship.

In this paper I shall prove the results for the finite-dimensional
case and the limiting case in a similar fashion in order that the
relationship between the results be clearer and that the asymptotic results
be more easily understood. The emphasis here is on the linear algebra;
the rigorous derivation of the limits, which is rather involved, is
omitted (but is given in Section 10.2 of [3]).

The  method  of  proof  is  not  the  most  direct  for  Theorem  1,  because 
the  proof  uses  covariance  matrices  instead  of  the  structure  of  the 
estimates  themselves.  On  the  other  hand,  the  asymptotic  results  must 
be  derived  in  terms  of  the  covariance  matrices  because  the  order  of 
the  observation  vector  increases,  and  thus  the  structure  of  the  estimate 
changes.  To  obtain  comparable  proofs,  covariance  matrices  must  be 
used  throughout.  A  by-product  of  my  proof  of  the  theorems  is  a  different 
statement  of  the  conditions  of  Grenander  and  Rosenblatt,  which,  I  hope, 
is  more  enlightening  than  the  original.  Watson  [9]  related  the  two  sets 
of  results  by  considering  the  finite-sample  case  in  the  framework  of 
the approach of Grenander and Rosenblatt.

2.  The Finite-Sample Case

We shall now proceed to prove Theorem 1 by considering the conditions
for which the two covariance matrices, (5) and (6), are identical. To
study this problem it will be convenient to transform the coordinate
system in the T-dimensional space to the coordinate system defined by
the characteristic vectors of the covariance matrix $\Sigma$. Let

(7)  $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_T)$ ,

where $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_T\ (> 0)$ are the characteristic roots of $\Sigma$.
Let V be a T x T matrix with columns as corresponding normalized
characteristic vectors. These properties can be summarized in the two
matrix equations


(8)  $\Sigma V = V\Lambda$ ,

(9)  $V'V = I$ ,

which imply $\Sigma = V\Lambda V'$ and $I = VV'$. We can refer the matrix of
independent variables to this coordinate system. Then

(10)  $Z = VG$ ,

where

(11)  $G = (g_1, g_2, \ldots, g_T)'$

and $g_t$ is a p-component vector, t = 1, ..., T. The two covariance
matrices depend on three matrices involving Z and $\Sigma$. These can be
written in terms of $\Lambda$ and G as

(12)  $Z'Z = G'V'VG = G'G = \sum_{t=1}^{T} g_t g_t'$ ,

(13)  $Z'\Sigma Z = G'V'\Sigma VG = G'\Lambda G = \sum_{t=1}^{T} \lambda_t g_t g_t'$ ,

(14)  $Z'\Sigma^{-1}Z = G'V'\Sigma^{-1}VG = G'\Lambda^{-1}G = \sum_{t=1}^{T} \frac{1}{\lambda_t} g_t g_t'$ .

The columns of V are characteristic vectors of $\Sigma^{-1}$ corresponding to
roots which are the reciprocals of the characteristic roots of $\Sigma$. We
shall follow these matrices along.
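For a single regressor (p = 1) the three sums (12), (13), and (14) are scalars, so the variances (5) and (6) can be compared directly. The following is a minimal sketch under that assumption; the function name `ols_efficiency` and the numerical inputs are illustrative and do not appear in the original. Exact rational arithmetic is used so that equality of the two variances can be tested without rounding error.

```python
from fractions import Fraction


def ols_efficiency(g, lam):
    """Ratio var(b) / var(b*) for p = 1, using the scalar forms of (12)-(14).

    g   : coordinates g_1, ..., g_T of the regressor in the eigenvector basis
    lam : characteristic roots lambda_1, ..., lambda_T (all positive)
    """
    g = [Fraction(x) for x in g]
    lam = [Fraction(x) for x in lam]
    zz = sum(x * x for x in g)                        # Z'Z, as in (12)
    zsz = sum(l * x * x for l, x in zip(lam, g))      # Z' Sigma Z, as in (13)
    zsinvz = sum(x * x / l for l, x in zip(lam, g))   # Z' Sigma^{-1} Z, as in (14)
    var_blue = 1 / zsinvz                             # (5)
    var_ols = zsz / (zz * zz)                         # (6)
    return var_blue / var_ols


# Regressor along one characteristic vector: the two estimates agree.
print(ols_efficiency([1, 0, 0], [3, 2, 1]))
# Regressor spread over two distinct roots: least squares loses efficiency.
print(ols_efficiency([1, 1, 0], [3, 2, 1]))
# Same regressor, but the two roots coincide: the estimates agree again.
print(ols_efficiency([1, 1, 0], [2, 2, 1]))
```

The third call illustrates why multiplicities of the roots matter: what counts is whether the regressor lies in a single characteristic subspace, not whether it is a particular column of V.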

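The characteristic roots may not all be different. Let us indicate
the multiplicity of the roots by writing the diagonal matrix $\Lambda$ in the
partitioned form

(15)  $\Lambda = \begin{pmatrix} \nu_1 I & 0 & \cdots & 0 \\ 0 & \nu_2 I & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \nu_H I \end{pmatrix}$ ,

where $\nu_1 > \nu_2 > \cdots > \nu_H\ (> 0)$ are the different characteristic roots.
The orders of the diagonal blocks are the multiplicities of the
corresponding roots, say $m_1, m_2, \ldots, m_H$, with $\sum_{h=1}^{H} m_h = T$. We partition
V and G similarly,

(16)  $V = (V^{(1)}, V^{(2)}, \ldots, V^{(H)})$ ,

(17)  $G = \begin{pmatrix} G^{(1)} \\ G^{(2)} \\ \vdots \\ G^{(H)} \end{pmatrix}$ .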
Now let us go back to the matrices we considered previously, and express
them in these new terms. Z is written as

(18)  $Z = \sum_{h=1}^{H} V^{(h)}G^{(h)}$ .

The three matrices appearing in the covariance matrices are

(19)  $Z'Z = \sum_{h=1}^{H} G^{(h)\prime}G^{(h)}$ ,

(20)  $Z'\Sigma Z = \sum_{h=1}^{H} \nu_h G^{(h)\prime}G^{(h)}$ ,

(21)  $Z'\Sigma^{-1}Z = \sum_{h=1}^{H} \frac{1}{\nu_h} G^{(h)\prime}G^{(h)}$ .

The definition of a submatrix of V may have some indeterminacy in it.
We can replace $V^{(h)}$ by $V^{(h)}Q^{(h)}$ and replace $G^{(h)}$ by $Q^{(h)\prime}G^{(h)}$,
where $Q^{(h)}$ is an orthogonal matrix of order $m_h$. Such a transformation
leaves each of the last four equations invariant.

Theorem  1  shall  be  shown  to  be  equivalent  to  the  following  theorem: 


Theorem 2. $\mathcal{C}(b) = \mathcal{C}(b^*)$ if and only if

(22)  $\sum_{h=1}^{H} \rho(G^{(h)}) = p$ ,

where $\rho(G^{(h)})$ denotes the rank of $G^{(h)}$.

In order to simplify the study of the conditions for the equality
of the covariance matrices, it is convenient to transform the matrices
again. Let P be a nonsingular matrix such that

(23)  $P'(Z'Z)P = I$ ,

(24)  $P'(Z'\Sigma Z)P = D$ ,

where D is a diagonal matrix with $d_{11} \ge d_{22} \ge \cdots \ge d_{pp} \ge 0$. [These
are the characteristic roots of $Z'\Sigma Z(Z'Z)^{-1}$.] Let us also make the
transformation of the other matrix, $P'(Z'\Sigma^{-1}Z)P$. The covariance matrix
of $P^{-1}b$ is the inverse of this last matrix. The covariance matrix of
$P^{-1}b^*$ is D. (This can be seen from the original expression for the
covariance matrix of $b^*$, (6), by multiplication on the left by $P^{-1}$
and on the right by $P'^{-1}$ and with use of the properties of the matrices
we have just discussed.) The question of equality of the original
covariance matrices has now been reduced to the problem of when the covariance
matrix of $P^{-1}b$ is D.
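The reduction just described can be written out in one line each. The steps below are a sketch that uses only (5), (6), (23), and (24); the one intermediate fact is that (23) is equivalent to $(Z'Z)^{-1} = PP'$ because P is nonsingular.

```latex
\begin{align*}
\mathcal{C}(P^{-1}b^*)
  &= P^{-1}(Z'Z)^{-1}\,Z'\Sigma Z\,(Z'Z)^{-1}P'^{-1}
   = P^{-1}(PP')\,Z'\Sigma Z\,(PP')P'^{-1}
   = P'(Z'\Sigma Z)P = D, \\
\mathcal{C}(P^{-1}b)
  &= P^{-1}(Z'\Sigma^{-1}Z)^{-1}P'^{-1}
   = \bigl[P'(Z'\Sigma^{-1}Z)P\bigr]^{-1}.
\end{align*}
```

Hence $\mathcal{C}(b) = \mathcal{C}(b^*)$ holds exactly when $P'(Z'\Sigma^{-1}Z)P = D^{-1}$.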

The three matrices in $\mathcal{C}(b)$ and $\mathcal{C}(b^*)$ can be written

(25)  $I = P'Z'ZP = \sum_{h=1}^{H} C^{(h)}$ ,

(26)  $D = P'Z'\Sigma ZP = \sum_{h=1}^{H} \nu_h C^{(h)}$ ,

(27)  $P'Z'\Sigma^{-1}ZP = \sum_{h=1}^{H} \frac{1}{\nu_h} C^{(h)}$ ,

where $C^{(h)} = P'G^{(h)\prime}G^{(h)}P$. Note that $\rho(C^{(h)}) = \rho(G^{(h)})$. Let us
consider the diagonal elements of each of the last three equations.
They are

(28)  $1 = \sum_{h=1}^{H} c_{ii}^{(h)}$ ,

(29)  $d_{ii} = \sum_{h=1}^{H} \nu_h c_{ii}^{(h)}$ ,

(30)  $\sum_{h=1}^{H} \frac{1}{\nu_h} c_{ii}^{(h)}$ .

Since the matrix $C^{(h)}$ is positive semidefinite, each diagonal element
is nonnegative. For each i the sum of these nonnegative components is
1; hence, the elements in the i-th diagonal position can be considered
as probabilities. Consider a random variable that takes on the value
$\nu_h$ with probability $c_{ii}^{(h)}$, h = 1, ..., H. Then $d_{ii}$ is the expected
value of this random variable. The last expression is the expected value
of the reciprocal of this positive random variable. If the two covariance
matrices are to be the same, the i-th diagonal element of the last matrix
must be the reciprocal of that diagonal element of the second matrix.
Thus, the random variable just defined can take on only one value with
probability 1. (This is basically the condition for equality in the
Cauchy-Schwarz inequality.) This implies that for each i, $c_{ii}^{(h)} = 1$
for one index h and is 0 for other values of h because the $\nu_h$
are distinct. These facts imply that the diagonal elements of the matrices
$C^{(h)}$ are 1's and 0's. The matrices $C^{(h)}$ have diagonal elements as
follows:

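(31)  $(c_{11}^{(1)}, \ldots, c_{pp}^{(1)}) = (1, \ldots, 1, 0, \ldots, 0)$ ,  $(c_{11}^{(2)}, \ldots, c_{pp}^{(2)}) = (0, \ldots, 0, 1, \ldots, 1, 0, \ldots, 0)$ ,  $\ldots$ .
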

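If a matrix $C^{(h)}$ has 1 in the i-th diagonal position, the other
matrices have 0 in that position. (Then $d_{ii} = \nu_h$. Since the $\nu_h$'s
and $d_{ii}$'s are numbered in descending order, the 1's in $C^{(1)}$ are
in the upper left-hand corner, etc.)

Some matrices may only have 0's on the main diagonal. Since $C^{(h)}$
is positive semidefinite, a diagonal element of 0 implies that the
entire corresponding row and column are 0. Thus

(32)  $C^{(1)} = \begin{pmatrix} C_1^* & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}$ ,  $C^{(2)} = \begin{pmatrix} 0 & 0 & \cdots & 0 \\ 0 & C_2^* & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}$ ,  $\ldots$ .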


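Since the $C^{(h)}$'s sum to I, and the nonzero blocks are not overlapping,

(33)  $C^{(1)} = \begin{pmatrix} I & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}$ ,  $C^{(2)} = \begin{pmatrix} 0 & 0 & \cdots & 0 \\ 0 & I & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}$ ,  $\ldots$ .

We have then $C^{(1)}$ with an identity in the upper left-hand corner and
so on. The rank of each $C^{(h)}$ is equal to the number of diagonal elements
that are 1. Thus, the sum of the ranks is equal to p. Therefore,
the equality of the covariance matrices implies that the sum of the
ranks is p.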
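The Cauchy-Schwarz step used above for the diagonal elements (28), (29), and (30) can be spelled out as follows. This is a sketch of the standard argument, written with the weights $c_{ii}^{(h)} \ge 0$ that sum to 1 over h.

```latex
\begin{align*}
1 = \Bigl(\sum_{h=1}^{H} c_{ii}^{(h)}\Bigr)^{2}
  = \Bigl(\sum_{h=1}^{H} \sqrt{\nu_h c_{ii}^{(h)}}\,
          \sqrt{c_{ii}^{(h)}/\nu_h}\Bigr)^{2}
  \le \Bigl(\sum_{h=1}^{H} \nu_h c_{ii}^{(h)}\Bigr)
      \Bigl(\sum_{h=1}^{H} \frac{1}{\nu_h}\, c_{ii}^{(h)}\Bigr).
\end{align*}
```

Equality requires $\sqrt{\nu_h c_{ii}^{(h)}}$ to be proportional to $\sqrt{c_{ii}^{(h)}/\nu_h}$, that is, $\nu_h$ must be the same for every h with $c_{ii}^{(h)} > 0$. Because the $\nu_h$ are distinct, only one $c_{ii}^{(h)}$ can be positive, and it must equal 1.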

The converse can be obtained by use of Cochran's theorem. (See
Lemma 7.4.1 of [2], for example.) However, we shall use a simplified
proof of a generalization of one part of Cochran's Theorem due to
Styan [8]. We assume the sum of the ranks of the $G^{(h)}$'s is p. Let
the nonnull $C^{(h)}$'s be $L_1, \ldots, L_K$ ($K \le p$) and let the ranks of these
matrices be $r_1, \ldots, r_K$, respectively. Then $L_j$ can be written
$A_j'A_j$, where $A_j$ is $r_j \times p$, j = 1, ..., K. Let U be the diagonal
matrix with j-th diagonal blocks of order $r_j$ consisting of $u_j I$,
respectively, where $u_j$ is the j-th value of $\nu_1, \ldots, \nu_H$
corresponding to a nonnull $C^{(h)}$, j = 1, ..., K. Let

(34)  $A = \begin{pmatrix} A_1 \\ \vdots \\ A_K \end{pmatrix}$ .

Then (25) and (26) are

(35)  $I = A'A$ ,

(36)  $D = A'UA$ .

Equation (35) shows that A is orthogonal as $\sum_{j=1}^{K} r_j = p$, and so it
follows from (36) that

(37)  $D^{-1} = A'U^{-1}A$ ,

which is (27). Since

(38)  $\sum_{h=1}^{H} \rho(C^{(h)}) = \sum_{j=1}^{K} r_j = p$ ,


Theorem  2  is  proved.  (That  equality  of  covariance  matrices  implies  the 
rank  condition  can  be  proved  by  the  method  used  in  the  converse,  but  it 
does  not  generalize  directly  to  the  case  of  stationary  residuals.) 



As was indicated earlier, $G^{(h)}$ in $Z = \sum_{h=1}^{H} V^{(h)}G^{(h)}$ can be
replaced by $Q^{(h)\prime}G^{(h)}$, where $Q^{(h)}$ is orthogonal. In particular,
$Q^{(h)}$ can be chosen so that $Q^{(h)\prime}G^{(h)}$ has as many nonzero rows as its rank.
(For the nonnull $C^{(h)}$'s or $G^{(h)}$'s, the resulting matrices are
$\ldots, A_j, \ldots$ .) This proves Theorem 1 for the finite-dimensional case.
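Theorem 2 can be checked numerically in the coordinates Z = VG, where the three matrices (12), (13), and (14) depend only on G and the characteristic roots. The sketch below is an illustration under that setup, not part of the original proof; the helper names (`inverse`, `rank`, `compare_covariances`) and the small test matrices are hypothetical. Exact fractions avoid any tolerance when comparing the two covariance matrices.

```python
from fractions import Fraction


def transpose(a):
    return [list(r) for r in zip(*a)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def inverse(a):
    """Gauss-Jordan inverse of a nonsingular square matrix of Fractions."""
    n = len(a)
    m = [list(row) + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(a)]
    for c in range(n):
        piv = next(r for r in range(c, n) if m[r][c] != 0)
        m[c], m[piv] = m[piv], m[c]
        pivot = m[c][c]
        m[c] = [x / pivot for x in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [row[n:] for row in m]


def rank(a):
    """Rank by Gaussian elimination (exact, since entries are Fractions)."""
    m = [list(row) for row in a]
    rk = 0
    for c in range(len(m[0]) if m else 0):
        piv = next((r for r in range(rk, len(m)) if m[r][c] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for r in range(rk + 1, len(m)):
            f = m[r][c] / m[rk][c]
            m[r] = [x - f * y for x, y in zip(m[r], m[rk])]
        rk += 1
    return rk


def compare_covariances(G, lam):
    """Return (C(b) == C(b*), sum of ranks of the blocks G^(h)) for Z = VG."""
    G = [[Fraction(x) for x in row] for row in G]
    lam = [Fraction(x) for x in lam]
    Gt = transpose(G)
    LG = [[l * x for x in row] for l, row in zip(lam, G)]      # Lambda G
    LinvG = [[x / l for x in row] for l, row in zip(lam, G)]   # Lambda^{-1} G
    gg_inv = inverse(matmul(Gt, G))                            # (G'G)^{-1}
    cov_blue = inverse(matmul(Gt, LinvG))                      # (5) with (14)
    cov_ols = matmul(matmul(gg_inv, matmul(Gt, LG)), gg_inv)   # (6) with (12), (13)
    blocks = {}                                                # rows grouped by root
    for l, row in zip(lam, G):
        blocks.setdefault(l, []).append(row)
    return cov_blue == cov_ols, sum(rank(b) for b in blocks.values())


print(compare_covariances([[1, 0], [0, 1], [0, 0]], [3, 2, 1]))
print(compare_covariances([[1, 0], [1, 1], [0, 1]], [3, 2, 1]))
print(compare_covariances([[1, 0], [1, 1], [0, 0]], [2, 2, 1]))
```

In the second case the block ranks sum to 3 > p = 2 and the covariance matrices differ; in the other two cases the ranks sum to p and the matrices coincide, as (22) requires.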


3.  Large-Sample Theory for Stationary Residuals

We now turn to the problem involving stationary time series. The
elements of the covariance matrix of y are

(39)  $\sigma_{st} = \sigma(s - t) = \int_{-\pi}^{\pi} e^{i(s-t)\lambda} f(\lambda)\, d\lambda$ ,

where $f(\lambda)$ is the spectral density, which is assumed to exist. Also
we assume that the spectral density satisfies the inequalities

(40)  $0 < \frac{m}{2\pi} \le f(\lambda) \le \frac{M}{2\pi}$ ,

where m and M are some positive constants. In developing the
asymptotic theory I shall not attempt to state all of the conditions.
(They are given in Section 10.2.3 of [3].) We write

(41)  $f(\lambda) = \frac{1}{2\pi} \sum_{h=-\infty}^{\infty} e^{i\lambda h} \sigma(h)$ .
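As an illustration of (40) and (41), consider a first-order moving average, for which the sum in (41) has only three nonzero terms. This process is an assumed example and is not discussed in the original; the function names are likewise hypothetical. For $y_t = e_t + \theta e_{t-1}$ with unit innovation variance and $|\theta| < 1$, one has $2\pi f(\lambda) = 1 + \theta^2 + 2\theta\cos\lambda$, so the constants in (40) can be taken as $m = (1 - |\theta|)^2$ and $M = (1 + |\theta|)^2$.

```python
import math


def ma1_autocov(theta, h):
    """sigma(h) for y_t = e_t + theta * e_{t-1} with Var(e_t) = 1."""
    h = abs(h)
    if h == 0:
        return 1.0 + theta * theta
    return theta if h == 1 else 0.0


def spectral_density(theta, lam, max_lag=1):
    """f(lambda) from (41); the sum is finite because sigma(h) = 0 for |h| > 1."""
    total = sum(
        ma1_autocov(theta, h) * complex(math.cos(lam * h), math.sin(lam * h))
        for h in range(-max_lag, max_lag + 1)
    )
    # The imaginary parts cancel because sigma(h) = sigma(-h).
    return total.real / (2.0 * math.pi)


def bounds(theta):
    """Constants m and M of (40) for this MA(1) example (needs |theta| < 1)."""
    return (1.0 - abs(theta)) ** 2, (1.0 + abs(theta)) ** 2


theta = 0.5
m, M = bounds(theta)
values = [2.0 * math.pi * spectral_density(theta, math.pi * k / 100.0)
          for k in range(-100, 101)]
print(m, min(values), max(values), M)
```

The printed minimum and maximum of $2\pi f(\lambda)$ over the grid should fall between m and M, with the maximum attained at $\lambda = 0$.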


Let the diagonal matrix $D_T$ be defined by

(42)  $\mathrm{diag}(Z'Z) = D_T^2$ ,

where we take the positive square roots. Since we are interested in
$T \to \infty$, we shall use the index T when convenient to emphasize that we
have a sequence of estimates. The suitable normalization of the estimates
is multiplication by this matrix $D_T$. We consider the limits of the
covariance matrices of $D_T b$ and of $D_T b^*$. The question is what are
necessary and sufficient conditions on the independent variables and
the spectral density such that

(43)  $\lim_{T\to\infty} \mathcal{C}(D_T b) = \lim_{T\to\infty} \mathcal{C}(D_T b^*)$ .


Let

(44)  $Z' = (z_1, \ldots, z_T)$ .

Consider the sum on t of $z_{t+h}z_t'$ and multiply on each side by $D_T^{-1}$
to obtain the matrix of lagged correlations of order h. Let the limit
of this matrix as $T \to \infty$ be

(45)  $R(h) = \lim_{T\to\infty} D_T^{-1} \sum_t z_{t+h} z_t' D_T^{-1}$ .

We assume that these limits exist for $h = 0, \pm 1, \pm 2, \ldots$. Then
this sequence of matrices has the spectral representation

(46)  $R(h) = \int_{-\pi}^{\pi} e^{i\lambda h}\, dM(\lambda)$ ,

where $M(\lambda)$ has complex-valued elements, is Hermitian, and has
increments that are positive semidefinite.

We shall now consider the limits of the covariance matrices of the
normalized estimates. Those covariance matrices involve the limits of
the matrices $D_T^{-1}Z'ZD_T^{-1}$, $D_T^{-1}Z'\Sigma ZD_T^{-1}$, and $D_T^{-1}Z'\Sigma^{-1}ZD_T^{-1}$. In fact,

(47)  $\lim_{T\to\infty} \mathcal{C}(D_T b) = \lim_{T\to\infty} (D_T^{-1}Z'\Sigma^{-1}ZD_T^{-1})^{-1}$ ,

(48)  $\lim_{T\to\infty} \mathcal{C}(D_T b^*) = R^{-1}(0) \lim_{T\to\infty} D_T^{-1}Z'\Sigma ZD_T^{-1}\, R^{-1}(0)$ .

The second matrix is

(49)  $\lim_{T\to\infty} D_T^{-1} \sum_{h=-(T-1)}^{T-1} \sum_t z_{t+h} z_t' \sigma(h) D_T^{-1} = \sum_{h=-\infty}^{\infty} R(h)\sigma(h) = \int_{-\pi}^{\pi} \sum_{h=-\infty}^{\infty} \sigma(h) e^{i\lambda h}\, dM(\lambda) = \int_{-\pi}^{\pi} 2\pi f(\lambda)\, dM(\lambda)$ .


Of  course,  these  operations  need  to  be  justified  to  give  a  rigorous 
proof,  but  that  requires  considerable  detail.  The  full  proof  is  given 
in  section  10.2.3  of  my  book  [3]  and  is  along  the  lines  indicated  by 
Grenander  and  Rosenblatt  [5].  The  three  matrices  we  are  interested  in 
can  be  written 


(50)  $\lim_{T\to\infty} D_T^{-1}Z'ZD_T^{-1} = \int_{-\pi}^{\pi} dM(\lambda)$ ,

(51)  $\lim_{T\to\infty} D_T^{-1}Z'\Sigma ZD_T^{-1} = \int_{-\pi}^{\pi} 2\pi f(\lambda)\, dM(\lambda)$ ,

(52)  $\lim_{T\to\infty} D_T^{-1}Z'\Sigma^{-1}ZD_T^{-1} = \int_{-\pi}^{\pi} \frac{1}{2\pi f(\lambda)}\, dM(\lambda)$ .


The  derivation  for  the  third  matrix  is  an  involved  demonstration  also 
given  in  [3].  These  three  expressions  are  the  analogues  of  (12),  (13), 
and  (14)  in  the  finite-dimensional  case.  Carrying  the  analogy  to  the 
finite-dimensional  case  further,  we  shall  write  these  integrals  in 
another  manner  to  resemble  (19),  (20),  and  (21).  Let 

(53)  $S(u) = \{\lambda \mid 2\pi f(\lambda) \le u\}$ ,  $m \le u \le M$ ,

(54)  $T(u) = \int_{S(u)} dM(\lambda)$ .



The  component  functions  of  T(u)  are  real.  Then  our  three  matrices  can 
be  written  as 


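(55)  $\lim_{T\to\infty} D_T^{-1}Z'ZD_T^{-1} = \int_m^M dT(u)$ ,

(56)  $\lim_{T\to\infty} D_T^{-1}Z'\Sigma ZD_T^{-1} = \int_m^M u\, dT(u)$ ,

(57)  $\lim_{T\to\infty} D_T^{-1}Z'\Sigma^{-1}ZD_T^{-1} = \int_m^M \frac{1}{u}\, dT(u)$ .
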


Similar  to  the  finite-dimensional  case  we  let  P  be  a  nonsingular 
matrix  such  that 


(58)  $P' \int_m^M dT(u)\, P = I$ ,

(59)  $P' \int_m^M u\, dT(u)\, P = D$ ,

where D is diagonal and $d_{11} \ge d_{22} \ge \cdots \ge d_{pp} \ge 0$. The same
transformation is applied to the third matrix, $\int_m^M u^{-1}\, dT(u)$, which is the
inverse of $\lim_{T\to\infty} \mathcal{C}(D_T b)$. The other limiting covariance matrix is
$\lim_{T\to\infty} \mathcal{C}(P^{-1}D_T b^*) = D$.



If we let

(60)  $L(u) = P'T(u)P$ ,

then the three matrices of interest are

(61)  $I = \lim_{T\to\infty} P'D_T^{-1}Z'ZD_T^{-1}P = \int_m^M dL(u)$ ,

(62)  $D = \lim_{T\to\infty} P'D_T^{-1}Z'\Sigma ZD_T^{-1}P = \int_m^M u\, dL(u)$ ,

(63)  $\lim_{T\to\infty} P'D_T^{-1}Z'\Sigma^{-1}ZD_T^{-1}P = \int_m^M \frac{1}{u}\, dL(u)$ .



A diagonal element of $L(u)$, say $\ell_{ii}(u)$, has the properties of a
cumulative distribution function. The corresponding diagonal elements
of (62) and (63) are $\int_m^M u\, d\ell_{ii}(u)$ and $\int_m^M u^{-1}\, d\ell_{ii}(u)$, which are
the expected values of the random variable with this distribution and
its reciprocal. Thus, if the two limiting covariance matrices are equal,
the matrix (63) is the inverse of (62) and

(64)  $\int_m^M u\, d\ell_{ii}(u) = \left[ \int_m^M \frac{1}{u}\, d\ell_{ii}(u) \right]^{-1}$ ;

this implies that $\ell_{ii}(u)$ has one point of increase and the increase is
1 at this point. Let the points of increase be $u_1 > u_2 > \cdots > u_K > 0$,
and let $L_j$ be the increase of $L(u)$ at $u_j$, j = 1, ..., K. Then
the three matrices can be written


(65)  $I = \lim_{T\to\infty} P'D_T^{-1}Z'ZD_T^{-1}P = \sum_{j=1}^{K} L_j$ ,

(66)  $D = \lim_{T\to\infty} P'D_T^{-1}Z'\Sigma ZD_T^{-1}P = \sum_{j=1}^{K} u_j L_j$ ,

(67)  $\lim_{T\to\infty} P'D_T^{-1}Z'\Sigma^{-1}ZD_T^{-1}P = \sum_{j=1}^{K} \frac{1}{u_j} L_j$ .



We are now back to the same forms that we had for the finite-dimensional
case, (25), (26), (27). The only difference is that in the earlier
case we had not culled out the vacuous matrices $C^{(h)}$. From this
point on the reasoning is the same. The matrices $L_1, L_2, \ldots, L_K$
have the form of (33); that is, the diagonal blocks are I's and 0's
and off-diagonal blocks are 0's.

The converse is similar to the finite-dimensional case. If $L(u)$
has K points of increase and the sum of the ranks of the increases
is p (and the increases are positive semidefinite with sum of I
and weighted sum of D), then by the previous reasoning, they are of
the form (33) and (67) is $D^{-1}$. We put these properties in terms of
$M(\lambda)$ and summarize them in a theorem.

Theorem 3. The limiting covariances of $D_T b$ and $D_T b^*$ are
identical if and only if $f(\lambda)$ takes on no more than p values on
the set of $\lambda$ for which $M(\lambda)$ increases and the sum of the ranks of
$\int dM(\lambda)$ over the sets of $\lambda$ for which $f(\lambda)$ takes on these values
is p.
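A simple case may help fix ideas. The following worked example is an illustration that is not taken from the original: it assumes p = 1 and $z_t = 1$ for all t, so that the regression coefficient is the mean of the series and least squares gives the sample mean.

```latex
% Illustration under the assumption p = 1, z_t = 1 for all t.
\begin{align*}
D_T^2 &= Z'Z = T, \qquad
R(h) = \lim_{T\to\infty} \frac{T - |h|}{T} = 1 \quad \text{for every } h, \\
M(\lambda) &\ \text{has a single jump of size } 1 \text{ at } \lambda = 0, \\
\lim_{T\to\infty} \mathcal{C}(D_T b^*)
  &= R^{-1}(0)\, 2\pi f(0)\, R^{-1}(0) = 2\pi f(0), \\
\lim_{T\to\infty} \mathcal{C}(D_T b)
  &= \left[ \frac{1}{2\pi f(0)} \right]^{-1} = 2\pi f(0).
\end{align*}
```

Here the spectrum of $M(\lambda)$ is the single point $\lambda = 0$, on which $f(\lambda)$ takes one value, and the increase has rank 1 = p, so the condition of Theorem 3 holds whatever the spectral density is.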

The set of $\lambda$ for which $M(\lambda)$ increases is called the spectrum of
$M(\lambda)$. The sets of $\lambda$ for which $f(\lambda)$ assumes its values are called
the elements of the spectrum. The properties of $L_1, \ldots, L_K$
(idempotent and orthogonal) determine these sets; Grenander and Rosenblatt
used them, though indirectly.

When the residuals are uncorrelated, $f(\lambda) = \sigma(0)/(2\pi)$ and the
conditions of Theorem 3 are satisfied. However, we may be interested
in conditions on the independent variables also which insure that least
squares be asymptotically efficient regardless of $f(\lambda)$.

Theorem 4. The limiting covariances of $D_T b$ and $D_T b^*$ are
identical for all stationary processes with spectral densities which are
bounded and bounded away from 0 if and only if $M(\lambda)$ increases at
not more than p values of $\lambda$, $0 \le \lambda \le \pi$, and the sum of the ranks
of the increases in $M(\lambda)$ is p.


If the number of points at which $M(\lambda)$ increases is at most p
($0 \le \lambda \le \pi$), then the spectrum of $M(\lambda)$ consists of these p points
and their corresponding negative values. The spectral density (which
is symmetric) can then take on at most p values, namely, its values
at these p points ($0 \le \lambda \le \pi$). On the other hand, if $M(\lambda)$ increases
at more than p points ($0 \le \lambda \le \pi$), then an $f(\lambda)$ can be constructed
so that it takes on more than p values.

An example of independent variables $\{z_{jt}\}$ such that $M(\lambda)$ has
one point of increase is $z_{jt} = t^{j-1}$, j = 1, ..., p, t = 1, 2, ... ;
the jump is at 0 and the increase in $M(\lambda)$ at $\lambda = 0$ is a positive
definite matrix

(68)  $M_0 = \left( \frac{\sqrt{2j-1}\,\sqrt{2k-1}}{j + k - 1} \right)$ .

In this case $R(h) = M_0$, $h = 0, \pm 1, \ldots$. If

(69)  $z_t = \alpha_0 + \sum_{j=1}^{H} (\alpha_j \cos \nu_j t + \beta_j \sin \nu_j t)$ ,

then $M(\lambda)$ has an increase of rank 1 at $\lambda = 0$ and an increase of
rank 2 at $\lambda = \nu_j$ (with $0 < \nu_j < \pi$), j = 1, ..., H. In these
examples the spectral distribution function of each independent variable
is a pure jump function, which can be considered as the opposite of a
density. Trigonometric functions act like characteristic vectors of a
covariance matrix in the sense that they are involved in spectral
representation. Comparison of $\Sigma = V\Lambda V'$ and (39) suggests that columns of
V correspond to functions $e^{i\lambda s}$, the diagonal components of $\Lambda$ correspond
to the values of $2\pi f(\lambda)$, and summation with respect to the index of
diagonal components of $\Lambda$ corresponds to integration with respect to
$\lambda/(2\pi)$. The analogue of $V'\Sigma V = \Lambda$ is (41), which involves a limiting
procedure.
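The polynomial-trend example around (68) can be explored numerically. The sketch below computes the finite-T lagged correlations that enter (45) for $z_{jt} = t^{j-1}$ and compares them with the closed form $\sqrt{(2j-1)(2k-1)}/(j+k-1)$, which follows from the leading terms of the power sums. The function names and the choice T = 5000 are illustrative assumptions, and only nonnegative lags h are handled.

```python
import math


def lagged_correlation(j, k, h, T):
    """(j, k) element of D_T^{-1} sum_t z_{t+h} z_t' D_T^{-1} for z_{jt} = t^(j-1).

    The sum runs over t = 1, ..., T - h; h is assumed to be nonnegative.
    """
    num = sum((t + h) ** (j - 1) * t ** (k - 1) for t in range(1, T - h + 1))
    norm_j = math.sqrt(sum(t ** (2 * (j - 1)) for t in range(1, T + 1)))
    norm_k = math.sqrt(sum(t ** (2 * (k - 1)) for t in range(1, T + 1)))
    return num / (norm_j * norm_k)


def limit_value(j, k):
    """(j, k) element of the limiting matrix for polynomial trends."""
    return math.sqrt((2 * j - 1) * (2 * k - 1)) / (j + k - 1)


T = 5000
for (j, k, h) in [(1, 2, 0), (2, 3, 3), (1, 3, 5)]:
    print(j, k, h, lagged_correlation(j, k, h, T), limit_value(j, k))
```

For large T the computed values should be close to the limit for every fixed lag h, which is consistent with the statement that R(h) does not depend on h in this example.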